Abstract

We give an $O(N \cdot \log N \cdot 2^{O(\log^* N)})$ algorithm for multiplying two $N$-bit integers that improves
the $O(N \cdot \log N \cdot \log\log N)$ algorithm by Schönhage-Strassen [SS71]. Both these algorithms use
modular arithmetic. Recently, Fürer [Für07] gave an $O(N \cdot \log N \cdot 2^{O(\log^* N)})$ algorithm which
however uses arithmetic over complex numbers as opposed to modular arithmetic. In this paper,
we use multivariate polynomial multiplication along with ideas from Fürer's algorithm to achieve
this improvement in the modular setting. Our algorithm can also be viewed as a $p$-adic version
of Fürer's algorithm. Thus, we show that the two seemingly different approaches to integer
multiplication, modular and complex arithmetic, are similar.

1 Introduction 

Computing the product of two $N$-bit integers is an important problem in algorithmic number
theory and algebra. A naive approach leads to an algorithm that uses $O(N^2)$ bit operations.
Karatsuba [KO63] showed that some multiplication operations of such an algorithm can be replaced
by less costly addition operations, which reduces the overall running time of the algorithm to
$O(N^{\log_2 3})$ operations. Shortly afterwards this result was improved by Toom [Too63], who showed
that for any $\varepsilon > 0$, integer multiplication can be done in $O(N^{1+\varepsilon})$ time. This led to the question as
to whether the time complexity can be improved further by replacing the term $O(N^{\varepsilon})$ by a
polylogarithmic factor. In a major breakthrough, Schönhage and Strassen [SS71] gave two efficient
algorithms for multiplying integers using fast polynomial multiplication. One of the algorithms
achieved a running time of $O(N \cdot \log N \cdot \log\log N \cdots 2^{O(\log^* N)})$ using arithmetic over complex
numbers (approximated to suitable precision), while the other used arithmetic modulo carefully
chosen integers to improve the complexity further to $O(N \cdot \log N \cdot \log\log N)$. Despite many efforts,
the modular algorithm remained the best until a recent remarkable result by Fürer [Für07]. Fürer
gave an algorithm that uses arithmetic over complex numbers and runs in $O(N \cdot \log N \cdot 2^{O(\log^* N)})$
time. To date this is the best time complexity result known for integer multiplication.

Schönhage and Strassen introduced two seemingly different approaches to integer multiplication:
using complex arithmetic and using modular arithmetic. Fürer's algorithm improves the time complexity
in the complex arithmetic setting by cleverly reducing some costly multiplications to simple shifts.
However, the algorithm needs to approximate the complex numbers to certain precisions during
computation. This introduces the added task of bounding the total truncation errors in the analysis
of the algorithm. In contrast, in the modular setting the error analysis is virtually absent.
In addition, modular arithmetic gives a discrete approach to a discrete problem like integer
multiplication. Therefore it is natural to ask whether we can achieve a similar improvement in time
complexity of this problem in the modular arithmetic setting. In this paper, we answer this
question affirmatively. We give an $O(N \cdot \log N \cdot 2^{O(\log^* N)})$ algorithm for integer multiplication using
modular arithmetic, thus matching the improvement made by Fürer.

Overview of our result 

As is the case in both Schönhage-Strassen's and Fürer's algorithms, we start by reducing the
problem to polynomial multiplication over a ring $\mathcal{R}$ by properly encoding the given integers. Polynomials
can be multiplied efficiently using Discrete Fourier Transforms (DFT), which use special roots of
unity. For instance, to multiply two polynomials of degree less than $M$ using the Fourier transform,
we require a principal $2M$-th root of unity (see Definition 2.1 for principal root). An efficient way
of computing the DFT of a polynomial is through the Fast Fourier Transform (FFT). In addition,
if multiplications by these roots are efficient, we get a faster algorithm. Since multiplication by 2
is a shift, it would be good to have a ring with 2 as a root of unity. One way to construct such
a ring in the modular setting is to consider rings of the form $\mathcal{R} = \mathbb{Z}/(2^M + 1)\mathbb{Z}$, as is the case in
Schönhage and Strassen [SS71]. However, this makes the size of $\mathcal{R}$ equal to $2^M$, which, although it
works in the case of Schönhage and Strassen's algorithm, is a little too large to handle in our case. We
would like to find a ring whose size is bounded by some polynomial in $M$ and which also contains
a principal $2M$-th root of unity. In fact, it is this choice of ring that poses the primary challenge in
adapting Fürer's algorithm and making it work in the discrete setting. In order to overcome this
hurdle we choose the ring to be $\mathcal{R} = \mathbb{Z}/p^c\mathbb{Z}$, for a prime $p$ and a constant $c$ such that $p^c = \mathrm{poly}(M)$.
The ring $\mathbb{Z}/p^c\mathbb{Z}$ has a principal $2M$-th root of unity if and only if $2M$ divides $p - 1$, which means
that we need to search for a prime $p$ in the arithmetic progression $\{1 + i \cdot 2M\}_{i > 0}$. To make this
search computationally efficient, we need the degree of the polynomials $M$ to be sufficiently small
compared to the input size. It turns out that this can be achieved by considering multivariate
polynomials instead of univariate polynomials. We use enough variables to make sure that the search
for such a prime does not affect the overall running time; the number of variables finally chosen is
a constant as well. In fact, the use of multivariate polynomial multiplications and a small ring are
the main steps where our algorithm differs from earlier algorithms by Schönhage-Strassen and Fürer.

The use of inner and outer DFT plays a central role in both Fürer's as well as our
algorithm. Towards understanding the notion of inner and outer DFT in the context of multivariate
polynomials, we present a group theoretic interpretation of the Discrete Fourier Transform (DFT).
Arguing along the lines of Fürer [Für07], we show that repeated use of efficient computation of inner
DFTs using some special roots of unity in $\mathcal{R}$ makes the overall process efficient and leads to an
$O(N \cdot \log N \cdot 2^{O(\log^* N)})$ time algorithm.



2 The Ring, the Prime and the Root of Unity 

We work with the ring $\mathcal{R} = \mathbb{Z}[\alpha]/(p^c, \alpha^m + 1)$ for some $m$, a constant $c$ and a prime $p$. Elements
of $\mathcal{R}$ are thus polynomials in $\alpha$ of degree at most $m - 1$ with coefficients from $\mathbb{Z}/p^c\mathbb{Z}$. By construction, $\alpha$ is
a $2m$-th root of unity and multiplication of any element in $\mathcal{R}$ by any power of $\alpha$ can be achieved
by shifting operations; this property is crucial in making some multiplications in the FFT less
costly (Section 4.2).
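
To make the shifting property concrete, here is a minimal Python sketch of multiplication by a power of $\alpha$ in $\mathbb{Z}[\alpha]/(p^c, \alpha^m + 1)$. The function name and the coefficient-list representation are assumptions made for illustration; they are not taken from the paper.

```python
def mul_by_alpha_power(coeffs, j, modulus):
    """Multiply an element of Z[alpha]/(modulus, alpha^m + 1) by alpha^j.

    coeffs[i] is the coefficient of alpha^i and m = len(coeffs) (assumed layout).
    Since alpha^m = -1, the product is a cyclic shift with sign flips,
    so no coefficient multiplications are needed.
    """
    m = len(coeffs)
    j %= 2 * m                      # alpha has multiplicative order 2m
    out = [0] * m
    for i, a in enumerate(coeffs):
        pos, sign = i + j, 1
        while pos >= m:             # each wrap past alpha^m contributes a factor -1
            pos -= m
            sign = -sign
        out[pos] = (sign * a) % modulus
    return out
```

For example, with $m = 4$ and modulus 17, multiplying $1 + 2\alpha + 3\alpha^2 + 4\alpha^3$ by $\alpha$ moves every coefficient up one slot and negates the one that wraps around.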

Given an $N$-bit number $a$, we encode it as a $k$-variate polynomial over $\mathcal{R}$ with degree in each
variable less than $M$. The parameters $M$ and $m$ are powers of two such that $M^k$ is roughly $\frac{N}{\log^2 N}$
and $m$ is roughly $\log N$. The parameter $k$ will ultimately be chosen to be a constant (see Section 5). We
now explain the details of this encoding.

2.1 Encoding Integers into multivariate Polynomials 

Given an $N$-bit integer $a$, we first break these $N$ bits into $M^k$ blocks of roughly $\frac{N}{M^k}$ bits each. This
corresponds to representing $a$ in base $q = 2^{N/M^k}$. Let $a = a_0 + \ldots + a_{M^k - 1} q^{M^k - 1}$ where $a_i < q$. The
number $a$ is converted into a polynomial as follows:

1. Express $i$ in base $M$ as $i = i_1 + i_2 M + \cdots + i_k M^{k-1}$.

2. Encode each term $a_i q^i$ as the monomial $a_i \cdot X_1^{i_1} X_2^{i_2} \cdots X_k^{i_k}$. As a result, the number $a$ gets
converted to the polynomial $\sum_{i=0}^{M^k - 1} a_i \cdot X_1^{i_1} \cdots X_k^{i_k}$.

Further, we break each $a_i$ into $\frac{m}{2}$ equal sized blocks where the number of bits in each block
is $u = \frac{2N}{M^k \cdot m}$. Each coefficient $a_i$ is then encoded as a polynomial in $\alpha$ of degree less than $\frac{m}{2}$. The
polynomials are then padded with zeroes to stretch their degrees to $m$. Thus, the $N$-bit number $a$
is converted to a $k$-variate polynomial $a(X)$ over $\mathbb{Z}[\alpha]/(\alpha^m + 1)$.

Given integers $a$ and $b$, each of $N$ bits, we encode them as polynomials $a(X)$ and $b(X)$ and
compute the product polynomial. The product $a \cdot b$ can be recovered by substituting $X_s = q^{M^{s-1}}$,
for $1 \le s \le k$, and $\alpha = 2^u$ in the polynomial $a(X) \cdot b(X)$. The coefficients in the product
polynomial could be as large as $M^k \cdot m \cdot 2^{2u}$ and hence it is sufficient to do arithmetic modulo
$p^c$ where $p^c > M^k \cdot m \cdot 2^{2u}$. Therefore, $a(X)$ can indeed be considered as a polynomial over
$\mathcal{R} = \mathbb{Z}[\alpha]/(p^c, \alpha^m + 1)$. Our choice of the prime $p$ ensures that $c$ is in fact a constant (see Section 5).
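
The outer level of this encoding can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the coefficients $a_i$ are kept as plain integers (the further split into a polynomial in $\alpha$ is omitted), the product is computed naively instead of by FFT, and the names `encode`, `multiply` and `evaluate` are hypothetical.

```python
def encode(a, block_bits, M, k):
    """Split a into M**k blocks of block_bits bits; return {(i_1, ..., i_k): a_i}."""
    assert 0 <= a < 1 << (block_bits * M ** k), "a does not fit in M^k blocks"
    q = 1 << block_bits
    poly = {}
    for i in range(M ** k):
        a_i = (a >> (i * block_bits)) & (q - 1)
        idx, t = [], i
        for _ in range(k):          # digits of i in base M: i = i_1 + i_2*M + ...
            idx.append(t % M)
            t //= M
        if a_i:
            poly[tuple(idx)] = a_i
    return poly

def multiply(f, g):
    """Naive product of sparse k-variate polynomials (stand-in for the FFT step)."""
    h = {}
    for ef, cf in f.items():
        for eg, cg in g.items():
            e = tuple(x + y for x, y in zip(ef, eg))
            h[e] = h.get(e, 0) + cf * cg
    return h

def evaluate(h, block_bits, M):
    """Substitute X_s = q^(M^(s-1)) with q = 2^block_bits."""
    total = 0
    for e, coeff in h.items():
        # enumerate starts at 0, which plays the role of s - 1
        shift = sum(e_s * block_bits * M ** s for s, e_s in enumerate(e))
        total += coeff << shift
    return total
```

Evaluating the product polynomial at these powers of two returns the integer product, because the monomial $X_1^{i_1} \cdots X_k^{i_k}$ evaluates to $q^i$.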

2.2 Choosing the prime 

The prime $p$ should be chosen such that the ring $\mathbb{Z}/p^c\mathbb{Z}$ has a principal $2M$-th root of unity, which
is required for polynomial multiplication using FFT. A principal root of unity is defined as follows.

Definition 2.1. An $n$-th root of unity $\zeta \in \mathcal{R}$ is said to be primitive if it generates a cyclic group
of order $n$ under multiplication. Furthermore, it is said to be principal if $n$ is coprime to the
characteristic of $\mathcal{R}$ and $\zeta$ satisfies $\sum_{i=0}^{n-1} \zeta^{ij} = 0$ for all $0 < j < n$.

In $\mathbb{Z}/p^c\mathbb{Z}$, a $2M$-th root of unity is principal if and only if $2M \mid p - 1$ (see also Section 6). As a
result, we need to choose the prime $p$ from the arithmetic progression $\{1 + i \cdot 2M\}_{i > 0}$, which is the
main bottleneck of our approach. We now explain how this bottleneck can be circumvented.

An upper bound for the least prime in an arithmetic progression is given by the following
theorem [Lin44]:

Theorem 2.2 (Linnik). There exist absolute constants $\ell$ and $L$ such that for any pair of coprime
integers $d$ and $n$, the least prime $p$ such that $p \equiv d \pmod{n}$ is less than $\ell n^L$.

Heath-Brown [HB92] showed that the Linnik constant $L \le 5.5$. Recall that $M$ is chosen such
that $M^k$ is $\Theta\left(\frac{N}{\log^2 N}\right)$. If we choose $k = 1$, that is, if we use univariate polynomials to encode
integers, then the parameter $M = \Theta\left(\frac{N}{\log^2 N}\right)$. Hence the least prime $p \equiv 1 \pmod{2M}$ could be as
large as $N^L$. Since all known deterministic sieving procedures take at least $N^L$ time, this is clearly
infeasible (for a randomized approach see Section 5.1). However, by choosing a larger $k$ we can
ensure that the least prime $p \equiv 1 \pmod{2M}$ is $O(N^{\varepsilon})$ for some constant $\varepsilon < 1$.

Remark 2.3. If $k$ is any integer greater than $L + 1$, then $M^L = O\left(N^{\frac{L}{L+1}}\right)$ and hence the least
prime $p \equiv 1 \pmod{2M}$ can be found in $o(N)$ time.
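
The brute-force search referred to in Remark 2.3 can be sketched as follows. This is only an illustration with trial division as the primality check; the paper does not prescribe a particular sieving procedure, and the function names are assumptions.

```python
def is_prime(n):
    """Trial division; adequate for the small candidates in this sketch."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def least_prime_in_progression(M):
    """Return the least prime p with p = 1 (mod 2M) by walking 1 + i*2M, i = 1, 2, ..."""
    p = 2 * M + 1
    while not is_prime(p):
        p += 2 * M
    return p
```

For instance, for $M = 16$ the candidates are 33, 65, 97, and the search stops at 97.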



2.3 The Root of Unity 

We require a principal $2M$-th root of unity in $\mathcal{R}$ to compute the Fourier transforms. This root $\rho(\alpha)$
should also have the property that its $\left(\frac{M}{m}\right)$-th power is $\alpha$, so as to make some multiplications in
the FFT efficient (Lemma 4.9). Such a root can be computed by interpolation in a way similar to
that in Fürer's algorithm [Für07, Section 3], but we briefly sketch the procedure for completeness.

We first obtain a $(p - 1)$-th root of unity $\zeta$ in $\mathbb{Z}/p^c\mathbb{Z}$ by lifting a generator of $\mathbb{F}_p^*$. The $\left(\frac{p-1}{2M}\right)$-th
power of $\zeta$ gives us a $2M$-th root of unity $\omega$. A generator of $\mathbb{F}_p^*$ can be computed by brute force, as
$p$ is sufficiently small. Having obtained a generator, we can use Hensel Lifting [NZM91, Theorem
2.23].

Lemma 2.4. Let $\zeta_s$ be a primitive $(p - 1)$-th root of unity in $\mathbb{Z}/p^s\mathbb{Z}$. Then there exists a unique
primitive $(p - 1)$-th root of unity $\zeta_{s+1}$ in $\mathbb{Z}/p^{s+1}\mathbb{Z}$ such that $\zeta_{s+1} \equiv \zeta_s \pmod{p^s}$. This unique root
is given by $\zeta_{s+1} = \zeta_s - \frac{f(\zeta_s)}{f'(\zeta_s)}$ where $f(X) = X^{p-1} - 1$.
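
The lifting step of Lemma 2.4 can be sketched in Python as below. The helper names are hypothetical, and the generator search uses the standard order test against the prime factors of $p - 1$, which is one possible way to realise the brute-force search mentioned in the text.

```python
def find_generator(p):
    """Brute-force a generator of F_p^* for an odd prime p."""
    factors, n, d = set(), p - 1, 2
    while d * d <= n:               # collect the prime factors of p - 1
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    for g in range(2, p):
        # g generates F_p^* iff g^((p-1)/r) != 1 for every prime r dividing p - 1
        if all(pow(g, (p - 1) // r, p) != 1 for r in factors):
            return g
    raise ValueError("no generator found; is p an odd prime?")

def hensel_lift_root(zeta, p, c):
    """Lift a root of f(X) = X^(p-1) - 1 modulo p to a root modulo p^c."""
    modulus = p
    for _ in range(c - 1):
        modulus *= p                # pass from p^s to p^(s+1)
        f = pow(zeta, p - 1, modulus) - 1
        df = (p - 1) * pow(zeta, p - 2, modulus) % modulus   # f'(zeta), a unit mod p
        zeta = (zeta - f * pow(df, -1, modulus)) % modulus   # zeta - f(zeta)/f'(zeta)
    return zeta
```

With $p = 17$ the generator found is 3, and lifting it to $\mathbb{Z}/17^3\mathbb{Z}$ gives an element that is congruent to 3 modulo 17 and whose 16th power is 1.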

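We need the following claims to compute the root $\rho(\alpha)$.

Claim 2.5. Let $\omega$ be a principal $2M$-th root of unity in $\mathbb{Z}/p^c\mathbb{Z}$.

(a) If $\sigma = \omega^{M/m}$, then $\sigma$ is a principal $2m$-th root of unity.

(b) The polynomial $x^m + 1 = \prod_{i=1}^{m} (x - \sigma^{2i-1})$ in $\mathbb{Z}/p^c\mathbb{Z}$. Moreover, for any $0 \le i < j < 2m$, the
ideals generated by $(x - \sigma^i)$ and $(x - \sigma^j)$ are comaximal in $\mathbb{Z}[x]/p^c\mathbb{Z}$.

(c) The roots $\{\sigma^{2i-1}\}_{1 \le i \le m}$ are distinct modulo $p$ and therefore the difference of any two of them
is a unit in $\mathbb{Z}[x]/p^c\mathbb{Z}$.

We then, through interpolation, solve for a polynomial $\rho(\alpha)$ such that $\rho(\sigma^{2i+1}) = \omega^{2i+1}$ for all
$1 \le i \le m$. Note that the exponents $2i + 1$ run over the same odd residues modulo $2m$ as the exponents $2i - 1$ in Claim 2.5. Then,

$$\rho(\sigma^{2i+1}) = \omega^{2i+1} \quad \forall\, 1 \le i \le m$$
$$\Longrightarrow (\rho(\alpha))^{M/m} = \alpha \pmod{\alpha - \sigma^{2i+1}} \quad \forall\, 1 \le i \le m$$
$$\Longrightarrow (\rho(\alpha))^{M/m} = \alpha \pmod{\alpha^m + 1}$$

The first two parts of the claim justify the Chinese Remaindering. Finally, computing a polynomial
$\rho(\alpha)$ such that $\rho(\sigma^{2i+1}) = \omega^{2i+1}$ can be done through interpolation.

The division by $(\sigma^{2i+1} - \sigma^{2j+1})$ in the interpolation is justified as it is a unit in $\mathbb{Z}/p^c\mathbb{Z}$ (part (c) of Claim 2.5).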
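
The interpolation can be checked on toy parameters with the following Python sketch. The values $p = 17$, $c = 2$, $M = 8$, $m = 4$ and the generator 3 are assumptions chosen only so that $2M$ divides $p - 1$. As a shortcut that replaces Hensel lifting, the sketch obtains a $(p-1)$-th root of unity modulo $p^c$ as $g^{p^{c-1}}$; Lagrange interpolation is used as one concrete way of doing the interpolation step.

```python
p, c, M, m = 17, 2, 8, 4            # toy parameters (assumed)
q = p ** c
g = 3                               # a generator of F_17^*
zeta = pow(g, p ** (c - 1), q)      # (p-1)-th root of unity mod p^c (shortcut, not Hensel)
omega = pow(zeta, (p - 1) // (2 * M), q)    # principal 2M-th root of unity
sigma = pow(omega, M // m, q)               # principal 2m-th root of unity

def poly_mul(f, h):
    """Multiply in (Z/qZ)[x]/(x^m + 1); f and h are coefficient lists of length m."""
    out = [0] * m
    for i, fi in enumerate(f):
        for j, hj in enumerate(h):
            k = i + j
            if k < m:
                out[k] = (out[k] + fi * hj) % q
            else:                   # reduce with x^m = -1
                out[k - m] = (out[k - m] - fi * hj) % q
    return out

def interpolate_rho():
    """Lagrange interpolation: rho(sigma^(2i+1)) = omega^(2i+1) for 1 <= i <= m."""
    nodes = [pow(sigma, 2 * i + 1, q) for i in range(1, m + 1)]
    values = [pow(omega, 2 * i + 1, q) for i in range(1, m + 1)]
    rho = [0] * m
    for i in range(m):
        term = [values[i]] + [0] * (m - 1)
        for j in range(m):
            if j != i:
                inv = pow((nodes[i] - nodes[j]) % q, -1, q)   # unit by Claim 2.5(c)
                # multiply by (x - nodes[j]) / (nodes[i] - nodes[j])
                term = poly_mul(term, [(-nodes[j] * inv) % q, inv] + [0] * (m - 2))
        rho = [(r + t) % q for r, t in zip(rho, term)]
    return rho

def poly_pow(f, e):
    """Repeated multiplication; fine for the tiny exponents used here."""
    result = [1] + [0] * (m - 1)
    for _ in range(e):
        result = poly_mul(result, f)
    return result

rho = interpolate_rho()
```

Raising `rho` to the power $M/m$ with `poly_pow` should return the coefficient list of $\alpha$, and raising it to the power $M$ should return $-1$, which is what the derivation above predicts.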

3 The Integer Multiplication Algorithm 

We are given two integers $a, b < 2^N$ to multiply. We fix constants $k$ and $c$ whose values are given
in Section 5. The algorithm is as follows:

1. Choose $M$ and $m$ as powers of two such that $M^k \approx \frac{N}{\log^2 N}$ and $m \approx \log N$. Find the least
prime $p \equiv 1 \pmod{2M}$ (Remark 2.3).

2. Encode the integers $a$ and $b$ as $k$-variate polynomials $a(X)$ and $b(X)$ respectively over the
ring $\mathcal{R} = \mathbb{Z}[\alpha]/(p^c, \alpha^m + 1)$ (Section 2.1).

3. Compute the root $\rho(\alpha)$ (Section 2.3).

4. Use $\rho(\alpha)$ as the principal $2M$-th root of unity to compute the Fourier transforms of the
$k$-variate polynomials $a(X)$ and $b(X)$. Multiply component-wise and take the inverse Fourier
transform to obtain the product polynomial.

5. Evaluate the product polynomial at appropriate powers of two to recover the integer product
and return it (Section 2.1).

The only missing piece is the Fourier transforms for multivariate polynomials. The following
section gives a group theoretic description of FFT.

4 Fourier Transform 

A convenient way to study polynomial multiplication is to interpret it as multiplication in a group
algebra.

Definition 4.1 (Group Algebra). Let $G$ be a group. The group algebra of $G$ over a ring $R$ is
the set of formal sums $\sum_{g \in G} \alpha_g g$ where $\alpha_g \in R$, with addition defined point-wise and multiplication
defined via convolution as follows:

$$\left(\sum_{g} \alpha_g g\right) \left(\sum_{h} \beta_h h\right) = \sum_{u} \left(\sum_{gh = u} \alpha_g \beta_h\right) u$$

Multiplying univariate polynomials over $R$ of degree less than $n$ can be seen as multiplication
in the group algebra $R[G]$ where $G$ is the cyclic group of order $2n$. Similarly, multiplying $k$-variate
polynomials of degree less than $n$ in each variable can be seen as multiplying in the group algebra
$R[G^k]$, where $G^k$ denotes the $k$-fold product group $G \times \ldots \times G$.
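
As a small illustration of this correspondence, the following Python sketch multiplies two univariate integer polynomials of degree less than $n$ by convolving them in the group algebra of the cyclic group of order $2n$. The function names are assumptions, and the base ring is taken to be the integers for simplicity.

```python
def group_algebra_product(f, g, order):
    """Convolution in R[Z/order Z] with R the integers: indices add modulo order."""
    h = [0] * order
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            h[(i + j) % order] += fi * gj
    return h

def poly_mul_via_group_algebra(f, g):
    """Multiply polynomials of degree < n inside the cyclic group of order 2n."""
    n = max(len(f), len(g))

    def pad(v):
        return list(v) + [0] * (2 * n - len(v))

    # The product has degree at most 2n - 2, so nothing wraps around modulo 2n.
    return group_algebra_product(pad(f), pad(g), 2 * n)
```

For example, $(1 + 2x)(3 + 4x) = 3 + 10x + 8x^2$ is recovered as the list `[3, 10, 8, 0]`.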

In this section, we study the Fourier transform over the group algebra $R[E]$ where $E$ is an
additive abelian group. Most of this, albeit in a different form, is well known but is provided here
for completeness [Sha99, Chapter 17].

In order to simplify our presentation, we will fix the base ring to be $\mathbb{C}$, the field of complex
numbers. Let $n$ be the exponent of $E$, that is, the maximum order of any element in $E$. A similar
approach can be followed for any other base ring as long as it has a principal $n$-th root of unity.

We consider $\mathbb{C}[E]$ as a vector space with basis $\{x\}_{x \in E}$ and use the Dirac notation to represent
elements of $\mathbb{C}[E]$: the vector $|x\rangle$, for $x$ in $E$, denotes the element $1 \cdot x$ of $\mathbb{C}[E]$.

Definition 4.2 (Characters). Let $E$ be an additive abelian group. A character of $E$ is a
homomorphism from $E$ to $\mathbb{C}^*$.

An example of a character of $E$ is the trivial character, which we will denote by 1, that assigns
to every element of $E$ the complex number 1. If $\chi_1$ and $\chi_2$ are two characters of $E$ then their
product $\chi_1 \cdot \chi_2$ is defined as $\chi_1 \cdot \chi_2(x) = \chi_1(x) \chi_2(x)$.

Proposition 4.3. [Sha99, Chapter 17, Theorem 1] Let $E$ be an additive abelian group of exponent
$n$. Then the values taken by any character of $E$ are $n$-th roots of unity. Furthermore, the characters
form a multiplicative abelian group $\hat{E}$ which is isomorphic to $E$.

An important property that the characters satisfy is the following [Isa94, Corollary 2.14].

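Proposition 4.4 (Schur's Orthogonality). Let $E$ be an additive abelian group. Then

$$\sum_{x \in E} \chi(x) = \begin{cases} 0 & \text{if } \chi \neq 1, \\ \#E & \text{otherwise,} \end{cases} \qquad \sum_{\chi \in \hat{E}} \chi(x) = \begin{cases} 0 & \text{if } x \neq 0, \\ \#E & \text{otherwise.} \end{cases}$$

It follows from Schur's orthogonality that the collection of vectors $|\chi\rangle = \sum_{x} \chi(x) |x\rangle$ forms a
basis of $\mathbb{C}[E]$. We will call this basis the Fourier basis of $\mathbb{C}[E]$.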

Definition 4.5 (Fourier Transform). Let $E$ be an additive abelian group and let $x \mapsto \chi_x$ be an
isomorphism between $E$ and $\hat{E}$. The Fourier transform over $E$ is the linear map from $\mathbb{C}[E]$ to $\mathbb{C}[E]$
that sends $|x\rangle$ to $|\chi_x\rangle$.

Thus, the Fourier transform is a change of basis from the point basis $\{|x\rangle\}_{x \in E}$ to the Fourier
basis $\{|\chi_x\rangle\}_{x \in E}$. The Fourier transform is unique only up to the choice of the isomorphism $x \mapsto \chi_x$.
This isomorphism is determined by the choice of the principal root of unity.

Remark 4.6. Given an element $|f\rangle \in \mathbb{C}[E]$, to compute its Fourier transform it is sufficient to
compute the Fourier coefficients $\{\langle \chi | f \rangle\}_{\chi \in \hat{E}}$.
4.1 Fast Fourier Transform 

We now describe the Fast Fourier Transform for general abelian groups in the character theoretic
setting. For the rest of the section fix an additive abelian group $E$ over which we would like to
compute the Fourier transform. Let $A$ be any subgroup of $E$ and let $B = E/A$. For any such pair
of abelian groups $A$ and $B$, we have an appropriate Fast Fourier transformation, which we describe
in the rest of the section.

Proposition 4.7. 1. Every character $\lambda$ of $B$ can be "lifted" to a character of $E$ (which will also
be denoted by $\lambda$) defined as follows: $\lambda(x) = \lambda(x + A)$.

2. Let $\chi_1$ and $\chi_2$ be two characters of $E$ that when restricted to $A$ are identical. Then $\chi_1 = \chi_2 \lambda$
for some character $\lambda$ of $B$.

3. The group $\hat{B}$ is (isomorphic to) a subgroup of $\hat{E}$ with the quotient group $\hat{E}/\hat{B}$ being
(isomorphic to) $\hat{A}$.

We now consider the task of computing the Fourier transform of an element $|f\rangle = \sum f_x |x\rangle$
presented as a list of coefficients $\{f_x\}$ in the point basis. For this, it is sufficient to compute the
Fourier coefficients $\{\langle \chi | f \rangle\}$ for each character $\chi$ of $E$ (Remark 4.6). To describe the Fast Fourier
transform we fix two sets of coset representatives, one of $A$ in $E$ and one of $\hat{B}$ in $\hat{E}$, as follows.

1. For each $b \in B$, $b$ being a coset of $A$, fix a coset representative $x_b \in E$ such that $b = x_b + A$.

2. For each character $\varphi$ of $A$, fix a character $\chi_\varphi$ of $E$ such that $\chi_\varphi$ restricted to $A$ is the character
$\varphi$. The characters $\{\chi_\varphi\}$ form (can be thought of as) a set of coset representatives of $\hat{B}$ in $\hat{E}$.

Since $\{x_b\}_{b \in B}$ forms a set of coset representatives, any $|f\rangle \in \mathbb{C}[E]$ can be written uniquely as

$$|f\rangle = \sum f_{b,a} |x_b + a\rangle.$$

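Proposition 4.8. Let $|f\rangle = \sum f_{b,a} |x_b + a\rangle$ be an element of $\mathbb{C}[E]$. For each $b \in B$ and $\varphi \in \hat{A}$ let
$|f_b\rangle \in \mathbb{C}[A]$ and $|f_\varphi\rangle \in \mathbb{C}[B]$ be defined as follows.

$$|f_b\rangle = \sum_{a \in A} f_{b,a} |a\rangle$$
$$|f_\varphi\rangle = \sum_{b \in B} \overline{\chi_\varphi}(x_b) \langle \varphi | f_b \rangle |b\rangle$$

Then for any character $\chi$ of $E$, which can be expressed as $\chi = \lambda \cdot \chi_\varphi$, the Fourier coefficient

$$\langle \chi | f \rangle = \langle \lambda | f_\varphi \rangle.$$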
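Proof. Recall that $\lambda(x + A) = \lambda(x)$, and $\varphi$ is the restriction of $\chi$ to the subgroup $A$.

$$\langle \chi | f \rangle = \sum_{b} \sum_{a} \overline{\chi}(x_b + a) f_{b,a}$$
$$= \sum_{b} \sum_{a} \overline{\lambda}(x_b + a) \overline{\chi_\varphi}(x_b + a) f_{b,a}$$
$$= \sum_{b} \overline{\lambda}(b) \overline{\chi_\varphi}(x_b) \sum_{a} \overline{\varphi}(a) f_{b,a}$$
$$= \sum_{b} \overline{\lambda}(b) \overline{\chi_\varphi}(x_b) \langle \varphi | f_b \rangle$$
$$= \langle \lambda | f_\varphi \rangle$$

□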

We are now ready to describe the Fast Fourier transform given an element $|f\rangle = \sum f_x |x\rangle$.

1. For each $b \in B$ compute the Fourier transform of $|f_b\rangle$. This requires $\#B$ many Fourier
transforms over $A$.

2. As a result of the previous step we have for each $b \in B$ and $\varphi \in \hat{A}$ the Fourier
coefficients $\langle \varphi | f_b \rangle$. Compute for each $\varphi$ the vectors $|f_\varphi\rangle = \sum_{b \in B} \overline{\chi_\varphi}(x_b) \langle \varphi | f_b \rangle |b\rangle$. This requires
$\#\hat{A} \cdot \#B = \#E$ many multiplications by roots of unity.

3. For each $\varphi \in \hat{A}$ compute the Fourier transform of $|f_\varphi\rangle$. This requires $\#\hat{A} = \#A$ many Fourier
transforms over $B$.

4. Any character $\chi$ of $E$ is of the form $\chi_\varphi \lambda$ for some $\varphi \in \hat{A}$ and $\lambda \in \hat{B}$. Using Proposition 4.8
we have at the end of Step 3 all the Fourier coefficients $\langle \chi | f \rangle = \langle \lambda | f_\varphi \rangle$.

If the quotient group $B$ itself has a subgroup that is isomorphic to $A$ then we can apply this
process recursively on $B$ to obtain a divide and conquer procedure to compute the Fourier transform.
In the standard FFT we use $E = \mathbb{Z}/2^n\mathbb{Z}$. The subgroup $A$ is $2^{n-1}E$, which is isomorphic to $\mathbb{Z}/2\mathbb{Z}$,
and the quotient group $B$ is $\mathbb{Z}/2^{n-1}\mathbb{Z}$.
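
The steps above can be sketched for a cyclic group $E = \mathbb{Z}/n\mathbb{Z}$ with $A$ the subgroup of size $n_A$ and $B = E/A$ of size $n_B = n/n_A$. This Python sketch works over $\mathbb{Z}/p\mathbb{Z}$ rather than $\mathbb{C}$, uses the characters $\chi_y(x) = \omega^{xy}$ without conjugation (a sign convention assumed here), takes $x_b = b$ and $\chi_\varphi = \chi_j$ for $0 \le j < n_A$ as coset representatives, and computes the inner and outer transforms naively instead of recursively. All names are hypothetical.

```python
def naive_dft(f, root, mod):
    """Fourier coefficients sum_x f[x] * root^(x*y) for every y, computed directly."""
    n = len(f)
    return [sum(f[x] * pow(root, x * y, mod) for x in range(n)) % mod
            for y in range(n)]

def fft_two_level(f, root, n_a, mod):
    """One level of the subgroup/quotient decomposition for E = Z/nZ."""
    n = len(f)
    assert n % n_a == 0
    n_b = n // n_a
    # Step 1: for each coset b, transform f_b over A = {n_b * a} using root^n_b.
    inner = [naive_dft([f[b + n_b * a] for a in range(n_a)], pow(root, n_b, mod), mod)
             for b in range(n_b)]
    out = [0] * n
    for j in range(n_a):            # j indexes the characters of A
        # Step 2: multiply by the root of unity chi_j(x_b) = root^(j*b).
        twisted = [inner[b][j] * pow(root, j * b, mod) % mod for b in range(n_b)]
        # Step 3: transform over B using root^n_a.
        outer = naive_dft(twisted, pow(root, n_a, mod), mod)
        # Step 4: the character with index j + n_a*l is chi_j times a lifted character of B.
        for l in range(n_b):
            out[j + n_a * l] = outer[l]
    return out
```

With $p = 17$, $n = 16$ and the 16th root of unity 3, the output agrees with the direct transform `naive_dft(f, 3, 17)`.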

4.2 Analysis of the Fourier Transform 

Our goal is to multiply $k$-variate polynomials over $\mathcal{R}$, with the degree in each variable less than
$M$. This can be achieved by embedding the polynomials into the algebra of the product group
$E = \left(\frac{\mathbb{Z}}{2M \cdot \mathbb{Z}}\right)^k$ and multiplying them as elements of the algebra. Since the exponent of $E$ is $2M$, we
require a principal $2M$-th root of unity in the ring $\mathcal{R}$. We shall use the root $\rho(\alpha)$ (as defined in
Section 2.3) for the Fourier transform over $E$.

For every subgroup $A$ of $E$, we have a corresponding FFT. We choose the subgroup $A$ as $\left(\frac{\mathbb{Z}}{2m \cdot \mathbb{Z}}\right)^k$
and let $B$ be the quotient group $E/A$. The group $A$ has exponent $2m$ and $\alpha$ is a principal $2m$-th
root of unity. Since $\alpha$ is a power of $\rho(\alpha)$, we can use it for the Fourier transform over $A$. As
multiplications by powers of $\alpha$ are just shifts, this makes the Fourier transform over $A$ efficient.

Let $\mathcal{F}(2M, k)$ denote the complexity of computing the Fourier transform over $\left(\frac{\mathbb{Z}}{2M \cdot \mathbb{Z}}\right)^k$. We
have

$$\mathcal{F}(2M, k) = \left(\frac{M}{m}\right)^k \mathcal{F}(2m, k) + M^k \mathcal{M}_{\mathcal{R}} + (2m)^k \mathcal{F}\left(\frac{M}{m}, k\right) \qquad (1)$$

where $\mathcal{M}_{\mathcal{R}}$ denotes the complexity of multiplications in $\mathcal{R}$. The first term comes from the $\#B$ many
Fourier transforms over $A$ (Step 1 of FFT), the second term corresponds to the multiplications by
roots of unity (Step 2) and the last term comes from the $\#A$ many Fourier transforms over $B$
(Step 3).

Since $A$ is a subgroup of $B$ as well, Fourier transforms over $B$ can be recursively computed in a
similar way, with $B$ playing the role of $E$. Therefore, by simplifying the recurrence in Equation 1
we get:

$$\mathcal{F}(2M, k) = O\left(\frac{M^k \log M}{m^k \log m} \mathcal{F}(2m, k) + \frac{M^k \log M}{\log m} \mathcal{M}_{\mathcal{R}}\right) \qquad (2)$$



Lemma 4.9. $\mathcal{F}(2m, k) = O(m^{k+1} \log m \cdot \log p)$

Proof. The FFT over a group of size $n$ is usually done by taking 2-point FFTs followed by $\frac{n}{2}$-point
FFTs. This involves $O(n \log n)$ multiplications by roots of unity and additions in the base ring. Using
this method, Fourier transforms over $A$ can be computed with $O(m^k \log m)$ multiplications and
additions in $\mathcal{R}$. Since each multiplication is between an element of $\mathcal{R}$ and a power of $\alpha$, this can
be efficiently achieved through shifting operations. This is dominated by the addition operation,
which takes $O(m \log p)$ time, since this involves adding $m$ coefficients from $\mathbb{Z}/p^c\mathbb{Z}$. □



Therefore, from Equation 2,

$$\mathcal{F}(2M, k) = O\left(M^k \log M \cdot m \cdot \log p + \frac{M^k \log M}{\log m} \mathcal{M}_{\mathcal{R}}\right) \qquad (3)$$



5 Complexity Analysis 

The choice of parameters should ensure that the following constraints are satisfied: 

1. $M^k = \Theta\left(\frac{N}{\log^2 N}\right)$ and $m = O(\log N)$.

2. $M^L = O(N^{\varepsilon})$ where $L$ is the Linnik constant (Theorem 2.2) and $\varepsilon$ is any constant less than
1. Recall that this makes picking the prime by brute force feasible (see Remark 2.3).

3. $p^c > M^k \cdot m \cdot 2^{2u}$ where $u = \frac{2N}{M^k m}$. This is to prevent overflows during modular arithmetic
(see Section 2.1).

It is straightforward to check that $k > L + 1$ and $c > 5(k + 1)$ satisfy the above constraints.
Heath-Brown [HB92] showed that $L \le 5.5$ and therefore $c = 42$ clearly suffices.

Let $T(N)$ denote the time complexity of multiplying two $N$-bit integers. This consists of:

(a) Time required to pick a suitable prime $p$,

(b) Computing the root $\rho(\alpha)$,

(c) Encoding the input integers as polynomials,

(d) Multiplying the encoded polynomials,

(e) Evaluating the product polynomial.

As argued before, the prime $p$ can be chosen in $o(N)$ time. To compute $\rho(\alpha)$, we need to lift
a generator of $\mathbb{F}_p^*$ to $\mathbb{Z}/p^c\mathbb{Z}$ followed by an interpolation. Since $c$ is a constant and $p$ is a prime of
$O(\log N)$ bits, the time required for Hensel Lifting and interpolation is $o(N)$.

The encoding involves dividing bits into smaller blocks, and expressing the exponents of $q$ in
base $M$ (Section 2.1), and all these take $O(N)$ time since $M$ is a power of 2. Similarly, evaluation
of the product polynomial takes linear time as well. Therefore, the time complexity is dominated
by the time taken for polynomial multiplication.



Time complexity of Polynomial Multiplication 

From Equation 3, the complexity of the Fourier transform is given by

$$\mathcal{F}(2M, k) = O\left(M^k \log M \cdot m \cdot \log p + \frac{M^k \log M}{\log m} \mathcal{M}_{\mathcal{R}}\right)$$

Proposition 5.1. [Sch82] Multiplication in the ring $\mathcal{R}$ reduces to multiplying $O(\log^2 N)$ bit integers
and therefore $\mathcal{M}_{\mathcal{R}} = T(O(\log^2 N))$.

Proof. Elements of $\mathcal{R}$ can be seen as polynomials in $\alpha$ over $\mathbb{Z}/p^c\mathbb{Z}$ with degree at most $m$. Given
two such polynomials $f(\alpha)$ and $g(\alpha)$, encode them as follows: Replace $\alpha$ by $2^d$, transforming the
polynomials $f(\alpha)$ and $g(\alpha)$ to the integers $f(2^d)$ and $g(2^d)$ respectively. The parameter $d$ is chosen
such that the coefficients of the product $h(\alpha) = f(\alpha) g(\alpha)$ can be recovered from the product
$f(2^d) \cdot g(2^d)$. For this, it is sufficient to ensure that the maximum coefficient of $h(\alpha)$ is less than
$2^d$. Since $f$ and $g$ are polynomials of degree $m$, we would want $2^d$ to be greater than $m \cdot p^{2c}$, which
can be ensured by choosing $d = \Theta(\log N)$. The integers $f(2^d)$ and $g(2^d)$ are bounded by $2^{md}$ and
hence the task of multiplying in $\mathcal{R}$ reduces to $O(\log^2 N)$ bit integer multiplication. □
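
The substitution $\alpha \mapsto 2^d$ used in this proof can be sketched in Python as follows. The function name is hypothetical, Python's built-in integer product stands in for the recursive call $T(O(\log^2 N))$, and the final reduction modulo $\alpha^m + 1$ and the modulus is written out explicitly.

```python
def ring_mul_kronecker(f, g, modulus):
    """Multiply f, g in Z[alpha]/(modulus, alpha^m + 1) via one big-integer product.

    f and g are coefficient lists of length m with entries in [0, modulus).
    """
    m = len(f)
    # d bits per slot: every product coefficient is below m * modulus^2 < 2^d.
    d = (m * modulus * modulus).bit_length()
    F = sum(coef << (d * i) for i, coef in enumerate(f))   # f(2^d)
    G = sum(coef << (d * i) for i, coef in enumerate(g))   # g(2^d)
    H = F * G                                              # single integer product
    mask = (1 << d) - 1
    out = [0] * m
    for i in range(2 * m - 1):
        coeff = (H >> (d * i)) & mask      # coefficient of alpha^i in f * g
        if i < m:
            out[i] = (out[i] + coeff) % modulus
        else:                              # alpha^m = -1
            out[i - m] = (out[i - m] - coeff) % modulus
    return out
```

For example, with $m = 4$ and modulus $17^2$, the product of $1 + 2\alpha + 3\alpha^2 + 4\alpha^3$ and $5 + 6\alpha + 7\alpha^2 + 8\alpha^3$ has unreduced coefficients 5, 16, 34, 60, 61, 52, 32, which fold to $-56, -36, 2, 60$ before reduction modulo 289.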

Multiplication of two polynomials involves a Fourier transform followed by component-wise
multiplications and an inverse Fourier transform. Since the number of component-wise multiplications
is only $M^k$, the time taken is $M^k \cdot \mathcal{M}_{\mathcal{R}}$, which is clearly subsumed in $\mathcal{F}(2M, k)$. Therefore, the
time taken for multiplying the polynomials is $O(\mathcal{F}(2M, k))$. Thus, the complexity of our integer
multiplication algorithm $T(N)$ is given by

$$T(N) = O(\mathcal{F}(2M, k))$$
$$= O\left(M^k \log M \cdot m \cdot \log p + \frac{M^k \log M}{\log m} \mathcal{M}_{\mathcal{R}}\right)$$
$$= O\left(N \log N + \frac{N}{\log N \cdot \log\log N} T(O(\log^2 N))\right)$$

The above recurrence leads to the following theorem.

Theorem 5.2. Given two $N$-bit integers, their product can be computed in $O(N \cdot \log N \cdot 2^{O(\log^* N)})$
time.
5.1 Choosing the Prime Randomly

To ensure that the search for a prime $p \equiv 1 \pmod{2M}$ does not affect the overall time complexity
of the algorithm, we considered multivariate polynomials to restrict the value of $M$; an alternative
is to use randomization.

Proposition 5.3. Assuming ERH, a prime $p$ such that $p \equiv 1 \pmod{2M}$ can be computed by a
randomized algorithm with expected running time $\tilde{O}(\log^3 M)$.

Proof. Titchmarsh [Tit30] (see also Tianxin [Tia90]) showed, assuming ERH, that the number of
primes less than $x$ in the arithmetic progression $\{1 + i \cdot 2M\}_{i > 0}$ is given by

$$\pi(x, 2M) = \frac{\mathrm{Li}(x)}{\varphi(2M)} + O(\sqrt{x} \log x)$$

for $2M \le \sqrt{x} \cdot (\log x)^{-2}$, where $\mathrm{Li}(x) = \Theta\left(\frac{x}{\log x}\right)$ and $\varphi$ is the Euler totient function. In our case, since
$M$ is a power of two, $\varphi(2M) = M$, and hence for $x \ge 4M^2 \cdot \log^6 M$, we have $\pi(x, 2M) = \Omega\left(\frac{x}{M \log x}\right)$.
Therefore, for an $i$ chosen uniformly at random in the range $1 \le i \le 2M \cdot \log^6 M$, the probability that
$i \cdot 2M + 1$ is a prime is at least $\frac{d}{\log x}$ for a constant $d$. Furthermore, a primality test of an $O(\log M)$ bit
number can be done in $\tilde{O}(\log^2 M)$ time using the Rabin-Miller primality test [Mil76, Rab80]. Hence,
with $x = 4M^2 \cdot \log^6 M$ a suitable prime for our algorithm can be found in expected $\tilde{O}(\log^3 M)$
time. □
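
A minimal Python sketch of this randomized search is given below. The function names are hypothetical, the number of Miller-Rabin rounds is an arbitrary choice, and the sampling range for $i$ follows the bound $2M \cdot \log^6 M$ from the proof with $\log M$ taken as the base-2 logarithm of the power of two $M$.

```python
import random

def is_probable_prime(n, rounds=20):
    """Rabin-Miller test; a composite n passes with probability at most 4^(-rounds)."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    d, s = n - 1, 0
    while d % 2 == 0:               # write n - 1 = d * 2^s with d odd
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False            # a witnesses that n is composite
    return True

def random_prime_one_mod_2M(M):
    """Sample i uniformly and return the first probable prime of the form 2M*i + 1."""
    log_m = max(1, M.bit_length() - 1)      # log2(M) for M a power of two
    bound = 2 * M * log_m ** 6              # range 1 <= i <= 2M * log^6 M
    while True:
        i = random.randint(1, bound)
        p = 2 * M * i + 1
        if is_probable_prime(p):
            return p
```

The returned value is a probable prime; for the small parameters used in an illustration it can be double-checked by trial division.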



6 A Different Perspective 

Our algorithm can be seen as a $p$-adic version of Fürer's integer multiplication algorithm, where the
field $\mathbb{C}$ is replaced by $\mathbb{Q}_p$, the field of $p$-adic numbers (for a quick introduction, see Baker's online
notes [Bak07]). Much like $\mathbb{C}$, where representing a general element (say in base 2) takes infinitely
many bits, representing an element in $\mathbb{Q}_p$ takes infinitely many $p$-adic digits. Since we cannot work
with infinitely many digits, all arithmetic has to be done with finite precision. Modular arithmetic
in the base ring $\mathbb{Z}[\alpha]/(p^c, \alpha^m + 1)$ can be viewed as arithmetic in the ring $\mathbb{Q}_p[\alpha]/(\alpha^m + 1)$ keeping
a precision of $\varepsilon = p^{-c}$.

Arithmetic with finite precision naturally introduces some errors in computation. However, the
nature of $\mathbb{Q}_p$ makes the error analysis simpler. The field $\mathbb{Q}_p$ comes with a norm $|\cdot|_p$ called the $p$-adic
norm, which satisfies the stronger triangle inequality $|x + y|_p \le \max\left(|x|_p, |y|_p\right)$ [Bak07,
Proposition 2.6]. As a result, unlike in $\mathbb{C}$, the errors in computation do not compound.
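
The stronger triangle inequality can be observed on integers with a few lines of Python. This is only an illustrative sketch restricted to integer arguments; the function name is an assumption.

```python
from fractions import Fraction

def p_adic_norm(x, p):
    """|x|_p = p^(-v), where p^v is the largest power of p dividing the integer x."""
    if x == 0:
        return Fraction(0)
    v = 0
    while x % p == 0:
        x //= p
        v += 1
    return Fraction(1, p ** v)
```

For instance, with $p = 3$ we have $|9|_3 = 1/9$ and $|18|_3 = 1/9$, while $|9 + 18|_3 = |27|_3 = 1/27$, which does not exceed the maximum of the two norms.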

Recall that the efficiency of FFT crucially depends on a special principal $2M$-th root of unity in
$\mathbb{Q}_p[\alpha]/(\alpha^m + 1)$. Such a root is constructed with the help of a primitive $2M$-th root of unity in $\mathbb{Q}_p$.
The field $\mathbb{Q}_p$ has a primitive $2M$-th root of unity if and only if $2M$ divides $p - 1$ [Bak07, Theorem
5.12]. Also, if $2M$ divides $p - 1$, a $2M$-th root can be obtained from a $(p - 1)$-th root of unity by
taking a suitable power. A primitive $(p - 1)$-th root of unity in $\mathbb{Q}_p$ can be constructed, to sufficient
precision, using Hensel Lifting starting from a generator of $\mathbb{F}_p^*$.
7 Conclusion

There are two approaches for multiplying integers, one using arithmetic over complex numbers, and
the other using modular arithmetic. Using complex numbers, Schönhage and Strassen [SS71] gave
an $O(N \cdot \log N \cdot \log\log N \cdots 2^{O(\log^* N)})$ algorithm. Fürer [Für07] improved this complexity to
$O(N \cdot \log N \cdot 2^{O(\log^* N)})$ using some special roots of unity. The other approach, that is modular arithmetic,
can be seen as arithmetic in $\mathbb{Q}_p$ with certain precision. A direct adaptation of the
Schönhage-Strassen algorithm in the modular setting leads to an $O(N \cdot \log N \cdot \log\log N \cdots 2^{O(\log^* N)})$ algorithm.
In this paper, we showed that by choosing an appropriate prime and a special root of unity, a running
time of $O(N \cdot \log N \cdot 2^{O(\log^* N)})$ can be achieved through modular arithmetic as well. Therefore, in
a way, we have unified the two paradigms.